Automatic Evaluation of Wordnet Synonyms and Hypernyms

Authors

  • Raghuvar Nadig
  • J. Ramanand
  • Pushpak Bhattacharyya
  • Rajiv Gandhi

Abstract

In recent times, wordnets have become indispensable resources for Natural Language Processing. However, creating a wordnet is time-consuming and labor-intensive. This has led to attempts to construct wordnets quickly, either from text repositories such as the web and certain corpora or by translating an existing wordnet into another language. The results of such attempts are often far from ideal: the resulting wordnet contains synsets with outlier words, missing words, or both. Semantic relations may also be set up inappropriately or be missing altogether. This has necessitated investigation into automatic methods of wordnet evaluation, which is in line with modern NLP’s insistence on concrete evaluation methodologies. To the best of our knowledge, the work reported here is the first attempt at an automatic method of wordnet evaluation. We focus on verifying synonymy within non-singleton synsets and hypernymy between synsets. Taking the Princeton WordNet as the gold standard, our method is shown to validate 70% of all non-singleton synsets and about the same proportion of hypernymy-hyponymy pairs.

Similar resources

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Evaluating WordNet Features in Text Classification Models

Incorporating semantic features from the WordNet lexical database is one of the many approaches that have been tried to improve the predictive performance of text classification models. The intuition behind this is that keywords in the training set alone may not be extensive enough to enable generation of a universal model for a category, but if we incorporate the word relationships in Wo...

Indexing with WordNet Synonyms May Improve Retrieval Results

This paper describes a method developed for the Robust Word Sense Disambiguation task at CLEF 2009. In our approach, a WordNet expanded index is generated from the disambiguated document collection. This index contains synonyms, hypernyms and holonyms of the disambiguated words contained in documents. Query words are integrated by terms extracted by means of a pseudo relevance feedback techniqu...

Geo-WordNet: Automatic Georeferencing of WordNet

WordNet has been used extensively as a resource for the Word Sense Disambiguation (WSD) task, both as a sense inventory and a repository of semantic relationships. Recently, we investigated the possibility of using it as a resource for the Geographical Information Retrieval task, more specifically for the toponym disambiguation task, which could be considered a specialization of WSD. We found tha...

WordNet-Based Text Document Clustering

Text document clustering can greatly simplify browsing large collections of documents by reorganizing them into a smaller number of manageable clusters. Algorithms to solve this task exist; however, the algorithms are only as good as the data they work on. Problems include ambiguity and synonymy, the former allowing for erroneous groupings and the latter causing similarities between documents t...

Publication date: 2008